Using Wikipedia to Translate OOV Term on MLIR
Abstract
We address Chinese, Japanese, and Korean multilingual information retrieval (MLIR) in NTCIR-6 and submit results for the C-CJK-T and C-CJK-D subtasks. In these runs, we adopt a dictionary-based approach to translate query terms. In addition to traditional dictionaries, we incorporate Wikipedia as a live dictionary.
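As a rough illustration of the dictionary-based approach with a fallback resource, the sketch below looks a query term up in a bilingual dictionary first and consults a second table only when the term is out of vocabulary. The abstract does not say how Wikipedia is queried, so the table of interlanguage-style links here is an assumption, as are the names `BILINGUAL_DICT`, `WIKI_LANGLINKS`, `translate_term`, and `translate_query` and the toy entries.

```python
from typing import Dict, List, Optional

# Hypothetical toy resources (illustrative assumptions, not the paper's data).
# A conventional bilingual dictionary keyed by Chinese source term.
BILINGUAL_DICT: Dict[str, Dict[str, str]] = {
    "電腦": {"ja": "コンピュータ", "ko": "컴퓨터"},
}
# A stand-in for Wikipedia used as a fallback dictionary: article title
# mapped to the titles of the corresponding articles in other languages.
WIKI_LANGLINKS: Dict[str, Dict[str, str]] = {
    "維基百科": {"ja": "ウィキペディア", "ko": "위키백과"},
}


def translate_term(term: str, target_lang: str) -> Optional[str]:
    """Return a translation, trying the dictionary before the fallback table."""
    for resource in (BILINGUAL_DICT, WIKI_LANGLINKS):
        entry = resource.get(term)
        if entry and target_lang in entry:
            return entry[target_lang]
    return None  # still out of vocabulary


def translate_query(terms: List[str], target_lang: str) -> List[str]:
    """Translate each term, keeping the original term when nothing is found."""
    return [translate_term(t, target_lang) or t for t in terms]
```

Keeping the untranslated term in the query is only one possible design choice; a real system might drop it or transliterate it instead.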
Similar resources
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
Using Wikipedia to translate domain-specific terms in SMT
When building a university lecture translation system, one important step is to adapt it to the target domain. One problem in this adaptation task is to acquire translations for domain-specific terms. In this approach we tried to get these translations from Wikipedia, which provides articles on very specific topics in many different languages. To extract translations for the domain-specific ter...
RMIT Chinese-English CLIR at NTCIR-4
We participated in the Chinese-English CLIR task, concentrating primarily on the issues of translation disambiguation and automatic translation extraction of OOV terms. A new technique to identify and translate Chinese OOV terms using the web was developed. The results for this aspect of our work appear promising.
Sublexical Translations for Low-Resource Language
Machine Translation (MT) for low-resource languages has low-coverage issues due to Out-Of-Vocabulary (OOV) words. In this research we propose a method using sublexical translation to achieve wide coverage in Example-Based Machine Translation (EBMT) from English to Bangla. For sublexical translation we divide the OOV words into sublexical units to obtain translation candidates. Previous ...
Using Sublexical Translations to Handle the OOV Problem in MT
We introduce a method for learning to translate out-of-vocabulary (OOV) words. The method focuses on combining sublexical/constituent translations of an OOV to generate its translation candidates. In our approach, wildcard searches are formulated based on our OOV analysis, aimed at maximizing the probability of retrieving OOVs’ sublexical translations from existing resource of machine translati...